Fourier-transform based linear equalization for MIMO CDMA downlink

ABSTRACT

In the reception of a downlink MIMO CDMA signal, the receiving unit performs a simplified process of linear equalization that eliminates the need for inverting the correlation matrix. The correlation matrix is approximated to a good degree by a circulant matrix that is diagonalized by FFT operations, thus substituting two FFTs and one IFFT having a complexity of $O\left( L_{F}\left( N\Delta \right)^{3} + \frac{\left( N\Delta \right)^{2} + 2\left( N\Delta \right)}{2}L_{F}\log_{2}L_{F} \right)$ for the direct matrix inversion having a complexity of O(L_(F)³).

RELATED APPLICATIONS

This application is a Continuation in Part of U.S. application Sequence Ser. No. 10/436,618, filed on May 13, 2003, assigned to the assignee hereof and incorporated herein by reference.

FIELD OF THE INVENTION

The invention relates to a MIMO reception method in mobile CDMA telephone systems wherein a desired signal is separated from other interfering signals by means of a linear equalization algorithm that avoids matrix inversion.

BACKGROUND OF THE INVENTION

A central problem in designing and implementing a data transmission system is simultaneous transmission and reception of signals from several simultaneous users such that the signals interfere with one another as little as possible. Because of this and the transmission capacity used, various transmission protocols and multiple access methods have been used, the most common especially in mobile phone traffic being FDMA (Frequency Division Multiple Access) and TDMA (Time Division Multiple Access), and recently CDMA (Code Division Multiple Access).

CDMA is a multiple access method based on a spread spectrum technique, and it has been recently put into use in cellular radio systems in addition to previously used FDMA and TDMA. CDMA has many advantages over the prior methods, such as simplicity of frequency planning, and spectrum efficiency.

In a CDMA method, a narrow-band data signal of a user is multiplied to a relatively broad band by a spreading code having a much broader band than the data signal. Band widths used in known test systems include e.g. 1.25 MHz, 10 MHz and 25 MHz. The multiplication spreads the data signal over the entire band to be used. All the users transmit simultaneously on the same frequency band. A different spreading code is used on each connection between a base station and a mobile station, and the signals of the users can be distinguished from one another in the receivers on the basis of the spreading code of the user. If possible, the spreading codes are selected in such a way that they are mutually orthogonal, i.e. they do not correlate with one another.

Correlators in conventionally implemented CDMA receivers are synchronized with a desired signal, which they recognize on the basis of the spreading code. In the receiver the data signal is restored to the original band by multiplying it by the same spreading code as in the transmission step. Ideally, the signals that have been multiplied by some other spreading code do not correlate and are not restored to the narrow band. In view of the desired signal, they thus appear as noise. The object is to detect the signal of the desired user from among a number of interfering signals. In practice, the spreading codes do correlate to some extent, and the signals of the other users make it more difficult to detect the desired signal by distorting the received signal. This interference caused by the users to one another is called multiple access interference.
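The spreading and despreading operations described above can be illustrated with a minimal sketch. It assumes length-4 Walsh codes, two users sending ±1 symbols, an ideal synchronous channel and no noise; the function names `spread` and `despread` are hypothetical and chosen only for this illustration.

```python
import numpy as np

# Length-4 Walsh (Hadamard) codes; the rows are mutually orthogonal.
walsh = np.array([[1, 1, 1, 1],
                  [1, -1, 1, -1],
                  [1, 1, -1, -1],
                  [1, -1, -1, 1]])

def spread(symbols, code):
    """Multiply each symbol by one full period of the spreading code."""
    return np.kron(symbols, code)

def despread(chips, code):
    """Correlate the chip stream with the code, one code period per symbol."""
    return chips.reshape(-1, code.size) @ code / code.size

# Two users transmit simultaneously on the same band with different codes.
symbols_a = np.array([1, -1, 1])
symbols_b = np.array([-1, -1, 1])
received = spread(symbols_a, walsh[1]) + spread(symbols_b, walsh[2])

recovered_a = despread(received, walsh[1])   # user B's signal correlates to zero
recovered_b = despread(received, walsh[2])   # user A's signal correlates to zero
```

With perfectly orthogonal codes and no multipath, each correlator recovers its own symbols exactly. Multipath propagation destroys this orthogonality, which is the situation the equalizer discussed later is meant to address.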

The situation is especially problematic when one or several users transmit with a considerably greater signal strength than the other users. These users employing greater signal strength interfere considerably with the connections of the other users. Such a situation is called a near-far problem, and it may occur for example in cellular radio systems when one or several users are situated near the base station and some users are further away, whereupon the users that are situated closer blanket the signals of the other users in the base station receiver, unless the power control algorithms of the system are very fast and efficient.

The reliable reception of signals is problematic especially in asynchronous systems, i.e. systems where the signals of the users are not synchronized with one another, since the symbols of the users are disturbed by the several symbols of the other users. In conventional receivers, filters matched with the spreading codes, and sliding correlators, which are both used as detectors, do not function well in near-far situations, however. Of the known methods the best result is provided by a decorrelating detector, which eliminates multiple access interference from the received signal by multiplying it by the cross-correlation matrix of the spreading codes used. The decorrelating detector is described in greater detail in Lupas, Verdu, ‘Linear multiuser detectors for synchronous code-division multiple access channels’, IEEE Transactions on Information Theory, Vol. 35, No. 1, pp. 123-136, January 1989; and Lupas, Verdu, ‘Near-far resistance of multiuser detectors in asynchronous channels’, IEEE Transactions on Communications, Vol. 38, April 1990. These methods, however, also involve many operations, such as matrix inversion operations, that require a high calculating capacity and that are especially demanding when the quality of the transmission channel and the number of the users vary constantly, as for example in cellular radio systems.

Channel equalization is a promising means of improving the downlink receiver performance in a frequency selective CDMA downlink. Current research encompasses two types of linear equalization, namely non-adaptive linear equalization and adaptive linear equalization. Non-adaptive linear equalizers usually assume “piece-wise” stationarity of the channel and design the equalizer according to some optimization criterion such as LMMSE (Linear Minimum Mean Squared Error) or zero-forcing, which in general leads to solving a system of linear equations by matrix inversion. This can be computationally expensive, especially when the coherence time of the channel is short and the equalizers have to be updated frequently. On the other hand, adaptive algorithms solve similar LMMSE or zero-forcing optimization problems by means of stochastic gradient algorithms and avoid direct matrix inversion. Although computationally more manageable, the adaptive algorithms are less robust since their convergence behavior and performance depend on the choices of parameters such as step size.

Multiple transmit, multiple receive systems have been employed in various contexts and it has been shown that in an independent flat-fading environment, the capacity of a MIMO system increases linearly with the number of antennas.

Applying a MIMO configuration to the CDMA downlink presents a significant challenge to the receiver designer because the receiver has to combat both inter-chip interference (ICI) and co-channel interference (CCI) in order to achieve reliable communication.

The art still has need of an equalization procedure that is robust and does not consume a great deal of computation power.

SUMMARY OF THE INVENTION

The purpose of the present invention is to provide an equalization method for downlink MIMO CDMA signals that avoids a computationally intense matrix inversion.

A feature of the invention is a linear filter process using only FFTs and IFFTs as steps in the filter coefficient generation process.

A feature of the invention is the approximation of the correlation matrix with a block circulant matrix that is diagonalized by a DFT operation.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a block diagram of a receiver for the general case.

FIG. 2 shows various equations used in the analysis of the invention.

FIG. 3 shows an overall view of a MIMO system according to the invention.

FIG. 4 shows a counterpart figure to FIG. 1 for the MIMO case.

FIG. 5 shows a comparison of an exact solution with a solution according to the invention for a QPSK example.

FIG. 6 shows a comparison of an exact solution with a solution according to the invention for a 16QAM example.

FIG. 7 shows a flow chart of the processing steps according to the invention.

BEST MODE OF CARRYING OUT THE INVENTION

In a preliminary discussion covering both single input/single output (SISO) and multiple input/multiple output (MIMO), consider the case of a CDMA downlink with at least one (M) antenna and J active users, each of which is assigned a number of codes K_(j), for j=1, . . . , J. Let K be the total number of active spreading codes (summed over K_(j)). Note that in our discussion, we use the spreading code index, rather than the user index, to simplify the notation. At the transmitter, the chip-level signal representation is given by Eq (1) in FIG. 2, where i, m and k are chip, symbol and spreading code indices. The base station scrambling code is denoted by c(i). Meanwhile, a_(k) stands for the power assigned to spreading code k, b_(k) is the information symbol sequence for spreading code k and s_(k)(i) is the spreading code k.

Let h=[h₀, . . . , h_(L)] be the composite chip-level channel impulse vector of spreading code k. Note that h includes the contributions from the transmit pulse shaper, the wireless propagation channel and the receive filter, so that it will change as the environment changes. Also note that since we only consider spreading code k throughout our discussion, we use h instead of h_(k) for clarity. The matrix-vector representation of the received signal is given in equation 2 in FIG. 2. To facilitate the discussion of linear equalization, we stack 2F+1 chips in the received vector r so that r(i)=[r(i+F), . . . , r(i), . . . , r(i−F)]^(T)=H(i)d(i)+n(i), where σ_(d)², defined by E[d(i)d^(H)(i)]=σ_(d)²I, is the transmitted chip power and h(i) is the (F+1)th column in H(i). The solution in this form is not desirable, since it depends on the chip index i and is time-varying. However, the dependence on i can be removed if the following two assumptions hold:

-   a) The channel vector h(i) is stationary over a block of chips. This condition is satisfied by choosing the block size such that the time span of the block is a small fraction of the channel coherence time. With this condition, the dependence on i is removed from h(i) and H(i).
-   b) The chip-level transmitted signal d(i) is white and wide sense stationary. It can be shown that this condition is strictly satisfied if the system is fully loaded, i.e., when K=G, and each spreading code is assigned equal power. Otherwise, this condition holds reasonably well except for very lightly loaded systems, i.e., when K<<G. The following solution is thus counterintuitive, since it is better at small signal to noise ratios than when the conditions are conventionally “better”; i.e. a signal stands out cleanly from the background.

Removing the dependence on time, the solution to the filter vector w becomes Equation 4 in FIG. 2, where sigma is a constant representing the transmitted power, and R is the correlation matrix from Eq 3. Those skilled in the art will be aware that the estimated data after equalization are represented by d(i)=w^(H)r(i), where r is the received signal in Equation 2 and w is relatively slowly varying. It has been observed, as shown in Eq 5, that R is banded Toeplitz in form, with individual elements given by Eq 6 that depend on the channel impulse vector h and some constants.

Those skilled in the art are aware that the analytic solution for w to the previous problem (expressing w in terms of other observed parameters) requires inverting the correlation matrix R. The inversion calculation requires computational resources and time. Providing the required computation resources in a mobile telephone handset is difficult, as is performing the calculations with limited hardware resources quickly enough to provide a satisfactory solution. Thus, the invention is well adapted to use in the receiver of a mobile handset in a CDMA cellular system.

The complexity of the matrix inversion is of the order of L_(F) ³, where L_(F)=2F+1 is the filter length. Further, the matrix inversion operation can be numerically unstable and inaccurate in the frequent case of fixed-point implementations.

It is an advantageous feature of the invention that matrix inversion is avoided by a process in which matrix inversion is replaced by Fourier transforms. In the preferred embodiment of the invention, the inversion of the correlation matrix is replaced by two FFTs (Fast Fourier Transform) and an inverse FFT.

If L_(F)>2L, we can convert R into a circulant matrix S by the addition of a matrix C according to Eq 7, where C is an upper triangular “corner” matrix defined in Eq 2. The purpose of this change is to take advantage of the property that every circulant matrix can be diagonalized by a DFT (Discrete Fourier Transform) matrix, i.e. S=D^(H)(Λ)D, where D is defined in Eq 9 and Λ is a diagonal matrix that is obtained by taking a DFT on the first column of S.
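The diagonalization property can be checked numerically. The following is a minimal sketch, assuming a unitary DFT matrix D and an arbitrary illustrative first column s; the helper `circulant` is hypothetical and not part of the described receiver.

```python
import numpy as np

def circulant(first_col):
    """Build the circulant matrix whose (m, n) entry is first_col[(m - n) mod N]."""
    first_col = np.asarray(first_col)
    n = first_col.size
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return first_col[idx]

n = 8
s = np.array([4.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.5, 1.0])  # illustrative first column
S = circulant(s)

D = np.fft.fft(np.eye(n)) / np.sqrt(n)   # unitary DFT matrix
Lam = np.diag(np.fft.fft(s))             # eigenvalues are the DFT of the first column
S_rebuilt = D.conj().T @ Lam @ D         # S = D^H Lambda D

# Inverting S only requires dividing by the eigenvalues.
S_inv = D.conj().T @ np.diag(1.0 / np.fft.fft(s)) @ D
```

In practice the matrices D and S are never formed explicitly; the products with D and D^(H) are carried out as FFT and IFFT operations on vectors.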

Defining V according to Eq 10, those skilled in the art will appreciate that the problem of inverting the L_(F)×L_(F) matrix R has been reduced to inverting the 2L×2L matrix J_(2L)−VS⁻¹V^(H), where J_(2L) is a 2L×2L “exchange” matrix (ones on anti-diagonals).

Further, if the filter length is much longer than the channel correlation length, i.e. L_(F)>>2L, then adding the two corners to the correlation matrix R does not significantly change the eigenstructure of the matrix. Accordingly, the inverse of R is approximately equal to the inverse of S. Therefore no direct matrix inversion is necessary since the inverse of S can be obtained with some FFT and IFFT operations.

Returning to the problem of isolating the desired signal, the solution becomes w=S⁻¹h=D^(H)(Λ)⁻¹Dh, where the D and D^(H) operations represent DFT and IDFT operations, respectively. As yet another simplification, the DFT operations can be replaced by computationally simpler FFT operations.

The signal recognition process then becomes:

-   1) Estimate the correlation matrix R from the received signal;
-   2) Convert R to the circulant matrix S by adding the two corner matrices;
-   3) Take FFT(s), where s is the first column of S, and generate Λ;
-   4) Calculate Dh=FFT(h) and (Λ)⁻¹Dh;
-   5) Transform back into the time domain, where w=D^(H)(Λ)⁻¹Dh=IFFT((Λ)⁻¹Dh); and
-   6) Apply the resulting w to the received vector r to calculate the estimated chip d.

The elements of the quantity (Λ)⁻¹Dh will also be referred to as the frequency domain filter taps. The estimated chip d is then processed conventionally to generate the analog voice signal (or data).
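The steps of this process can be sketched end to end and compared against direct inversion. The sketch below is illustrative only: it assumes a known channel h (so that R is computed from the channel model as σ_d²HH^(H)+σ_n²I rather than estimated from received data), white chips, and example values for the noise power; the function name `lmmse_fft_taps` is hypothetical.

```python
import numpy as np

def lmmse_fft_taps(h, F, sigma_d2=1.0, sigma_n2=0.1):
    """Return (w_fft, w_exact) for a (2F+1)-tap chip-level equalizer."""
    h = np.asarray(h, dtype=complex)
    L = h.size - 1
    n = 2 * F + 1                      # filter length L_F
    assert n > 2 * L and F >= L

    # Convolution matrix H: row p holds the taps that produce r(i + F - p).
    H = np.zeros((n, n + L), dtype=complex)
    for p in range(n):
        H[p, p:p + L + 1] = h
    R = sigma_d2 * H @ H.conj().T + sigma_n2 * np.eye(n)   # banded Toeplitz
    h_col = H[:, F]                    # the (F+1)th column of H

    # Circulant approximation: keep the first column of R and wrap the upper
    # band into its tail, which is the effect of adding the two corners.
    s = R[:, 0].copy()
    if L > 0:
        s[-L:] = R[0, L:0:-1]

    lam = np.fft.fft(s)                                      # eigenvalues of S
    w_fft = sigma_d2 * np.fft.ifft(np.fft.fft(h_col) / lam)  # two FFTs, one IFFT
    w_exact = sigma_d2 * np.linalg.solve(R, h_col)           # direct solution
    return w_fft, w_exact

w_fft, w_exact = lmmse_fft_taps([1.0, 0.5, 0.2, 0.1], F=32)
rel_err = np.linalg.norm(w_fft - w_exact) / np.linalg.norm(w_exact)
```

For this example the filter length (65 taps) is much longer than the channel span (4 taps), so the corner terms have very little influence and the two solutions agree closely.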

Since the filter is unchanged for a block of N chips, the calculation load per chip is normalized by N. N may be, illustratively, 1024. The overall per-chip complexity then becomes of order (L_(F)+(3L_(F)/2N) log₂ L_(F)), which compares favorably with the complexity of order (L_(F)+(1/N)L_(F)³) for the direct matrix inversion method.
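As a rough worked example of the two expressions, the sketch below evaluates them for an assumed filter length L_(F)=33 (F=16) and the illustrative block size N=1024. These are order-of-magnitude figures, not exact operation counts, and the function names are hypothetical.

```python
import math

def per_chip_order_fft(lf, n):
    """Order of the per-chip load for the FFT-based method."""
    return lf + (3 * lf / (2 * n)) * math.log2(lf)

def per_chip_order_direct(lf, n):
    """Order of the per-chip load for direct matrix inversion."""
    return lf + lf ** 3 / n

lf, n = 33, 1024
fft_cost = per_chip_order_fft(lf, n)
direct_cost = per_chip_order_direct(lf, n)
print(f"FFT-based: {fft_cost:.2f}, direct inversion: {direct_cost:.2f}")
```

The filtering term L_(F) is common to both; the difference lies entirely in the cost of updating the filter, which grows as L_(F)³ for inversion but only as L_(F) log₂ L_(F) for the FFT-based update.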

Referring now to FIG. 1, there is shown a block diagram of a generalized receiver, illustratively a mobile handset in a cellular CDMA system, in which antenna 105 receives incoming signals. The signals pass to channel estimator 110, which generates an initial estimate for parameters used in the calculations, and also pass to equalizer 120, which represents the circuits that perform the various calculations discussed below. In this algorithm, the process of estimating the correlation matrix elements is performed according to any convenient conventional method such as illustrated in the book “Statistical Signal Processing” by Louis Scharf, Addison Wesley. The calculations may be carried out in a special-purpose device, including a digital signal processor chip and/or a general purpose device such as a microprocessor. The instructions for carrying out the process may be stored in any convenient medium, e.g. a read-only memory chip readable by a machine.

The function of the equalizer is to partially or largely restore the orthogonality of the separate spreading codes representing the various “channels”, one for each user.

After the equalizer, a conventional code correlator, as known to those skilled in the art, such as that shown in the book “Digital Communications” by John Proakis, McGraw Hill, separates out the power associated with the particular code that is carrying the data of interest. A conventional deinterleaver selects the particular data of interest. Box 150, labeled Audio, represents schematically the conventional circuits that convert the digital signals processed up to this point to analog audio (or in the case of data, pass the data on to the next step). For convenience in expressing the claims, the signal leaving the deinterleaver 140 will be referred to as the output signal and the processes represented by box 150 (summing a block of data, performing digital to analog conversion, smoothing, amplifying, etc.) will be referred to as processing the output signal.

Numerical Calculation Techniques

Two calculation techniques have been found to improve the accuracy of the approximation used and the stability of the results. Adding an artificial noise floor to the matrix S by adding a unit matrix multiplied by a small constant prevents dividing by a small number when the eigenvalues of the matrix are used as divisors in the FFT. This is equivalent to assuming that the noise is worse than it really is.
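The effect of the artificial noise floor can be seen in a small sketch. It assumes a deliberately ill-conditioned 4-point circulant whose third eigenvalue is exactly zero; the values and the function name `frequency_domain_taps` are illustrative assumptions.

```python
import numpy as np

def frequency_domain_taps(s, h, floor):
    """Frequency-domain filter taps with a small constant added to every eigenvalue."""
    s_loaded = np.array(s, dtype=complex)
    s_loaded[0] += floor               # S + floor*I only changes the diagonal entry s[0]
    lam = np.fft.fft(s_loaded)         # each eigenvalue is shifted up by `floor`
    return np.fft.fft(h) / lam

s = np.array([1.0, 0.5, 0.0, 0.5])     # eigenvalues 2, 1, 0, 1: one is exactly zero
h = np.array([1.0, 0.2, 0.0, 0.0])
taps = frequency_domain_taps(s, h, floor=0.01)
```

Because adding a multiple of the unit matrix shifts every eigenvalue by the same constant, the zero eigenvalue becomes 0.01 and the division stays bounded, at the cost of a slightly more conservative filter.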

In addition, since the length of the impulse vector h is a constant fixed by the channel profile, we can improve the accuracy of the approximation by increasing the filter length L_(F). This has the effect of reducing inaccuracies introduced by adding in the corner matrix CL when the eigenvalues are calculated. Since increasing the filter length means higher filter complexity, a better tradeoff is provided by using a double length (2L_(F)) vector while performing calculations in the frequency domain. The initial set of chips in the received vector is expanded to length 2L_(F). This expanded vector is transformed to the Fourier domain and used for calculations. After the inverse Fourier Transform, the extra L_(F)/2 taps on the two sides are truncated and only the L_(F) taps in the center are used.

Multi-Channel Diversity

Multi-channel diversity reception is an important means of improving receiver performance. The benefit of diversity reception is two-fold: first, the outage probability is reduced since the chance that all the diversity branches experience deep fade is smaller; second, the added diversity branches provide additional signal dimension that can be used in enhancing the SNR, suppressing the ISI and MAI, etc.

Multi-channel diversity reception manifests itself in many forms. Among them, oversampling, multiple receive antennas, and antenna polarization are the most commonly used.

The performance of these methods critically depends on the statistical correlation between different diversity branches. In general, the smaller the correlation between different diversity branches, the better the overall receiver performance.

In this section, we extend our FFT based linear equalization method to single-antenna systems with diversity reception. The following treatment does not distinguish between different diversity methods since they all share the same mathematical form. To this end, let M denote the total number of diversity branches (typically 2 or 4) and we extend the received signal model of Eq 2 by substituting a small vector h_(i) for the scalar h_(i) of the previous discussion.

The correlation matrix is again banded block Toeplitz, with the change that the elements are now small matrices, as shown in eq 11 and 12. The problem of solving the matrix equation for the signal vector w is made more complex because the correlation matrix R is now ML_(F)×ML_(F) and correspondingly more difficult to invert directly.

The procedure of the previous section is followed by approximating the block Toeplitz matrix R with a block circulant matrix S. In order to invert S, we introduce a cyclic shift matrix P according to Eq 13, where I is the identity matrix. S, then, can be represented as Eq 14, where the symbol ⊗ denotes a Kronecker product and E₀, . . . , E_(L_(F)−1) form the first “block” column in matrix S. Proceeding analogously to the previous discussion, P may be diagonalized by a DFT, P=D^(H)WD, where D is the DFT matrix and W is diagonal of the form W=diag(1, W_(L_(F))⁻¹, . . . , W_(L_(F))^(−(L_(F)−1))), with W_(L_(F))=e^(j(2π/L_(F))). After some substitution, S may be expressed as Eq 15, where the expression 15-1 denotes dimension-wise IDFT and expression 15-3 denotes dimension-wise DFT, meaning that the DFT or IDFT is applied on each of the M diversity dimensions. The central expression 15-2 is a block diagonal matrix whose diagonal blocks are the element-wise DFT of the array of matrices E₀, . . . , E_(L_(F)−1), as expressed in eq 16, where F is an M×M matrix defined by eq 17. The inverse of S is therefore given by Eq 18. The inversion of F reduces to the inversion of L_(F) small M×M matrices, since F is block diagonal.

The procedure for multi-dimensional transmission can be summarized as:

-   1) Estimate the correlation matrix R from the received signal.
-   2) Convert R to the block circulant matrix S by adding two “corners”.
-   3) Take an “element-wise” FFT on the first “block” column of S and form F; invert to get F⁻¹.
-   4) Calculate the “dimension-wise” FFT of h, or (D⊗I)h, and F⁻¹(D⊗I)h.
-   5) Calculate the “dimension-wise” IFFT of F⁻¹(D⊗I)h to get the weight vector w=(D^(H)⊗I)F⁻¹(D⊗I)h.

This algorithm involves one “dimension-wise” FFT and IFFT on a vector of size ML_(F)×1 (equivalent to M FFT/IFFTs of length L_(F)), one “element-wise” FFT on a matrix of size M×M (which is equivalent to M² FFTs of length L_(F)) and L_(F) matrix inversions of size M×M. The complexity of this algorithm is of the order (L_(F)M³+((M²+2M)/2)L_(F) log₂ L_(F)), compared with the much higher complexity of order (ML_(F))³ of a direct matrix inversion of R.
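The block circulant procedure may be sketched as follows. The sketch assumes that the first block column E₀, . . . , E_(L_(F)−1) of S is already available as an array of shape (L_(F), M, M) and that h is arranged with one row per block; the numerical values and the function name `block_circulant_solve` are illustrative assumptions.

```python
import numpy as np

def block_circulant_solve(E, h):
    """Solve S w = h, where S is block circulant with first block column E.

    E has shape (L_F, M, M); h has shape (L_F, M), one row per block.
    """
    F_blocks = np.fft.fft(E, axis=0)     # element-wise FFT: L_F blocks of size M x M
    h_freq = np.fft.fft(h, axis=0)       # dimension-wise FFT of h
    x = np.linalg.solve(F_blocks, h_freq[..., None])[..., 0]   # L_F small M x M solves
    return np.fft.ifft(x, axis=0)        # dimension-wise IFFT

n, m = 8, 2                              # L_F = 8 blocks, M = 2 diversity branches
E = np.zeros((n, m, m), dtype=complex)
E[0] = [[2.0, 0.3], [0.3, 2.0]]
E[1] = [[0.5, 0.1], [0.2, 0.4]]
E[-1] = E[1].conj().T                    # Hermitian structure of a correlation matrix
h = np.zeros((n, m), dtype=complex)
h[3] = [1.0, 0.5]
h[4] = [0.2, -0.1]

w = block_circulant_solve(E, h)

# Dense check: block (p, q) of S is E[(p - q) mod L_F].
S = np.block([[E[(p - q) % n] for q in range(n)] for p in range(n)])
residual = np.linalg.norm(S @ w.reshape(-1) - h.reshape(-1))
```

The only matrix inversions performed are the L_(F) independent M×M solves in the frequency domain, which is where the L_(F)M³ term in the complexity estimate comes from.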

Multiple transmit, multiple receive antenna MIMO systems offer potential for realizing high spectral efficiency of a wireless communications system.

Applying MIMO configuration to the CDMA downlink presents significant challenge to the receiver design, as the receiver has to combat both the inter-chip interference (ICI) and the CCI in order to achieve reliable communication. Those skilled in the art are aware that both the conventional LMMSE algorithm and the Kalman filter algorithm can be extended to the MIMO system. Attempts to combine the non-linear decision feedback interference cancellation with the LMMSE equalization are also found in the literature. However, these algorithms perform the decision feedback directly at the received signal, and thus require the impractical assumption that all the active Walsh codes are known at the mobile receiver in order to reconstruct the transmitted chip sequences.

Consider an M transmit antenna, N receive antenna MIMO CDMA system as illustrated in FIG. 3. The input data enter on line 302 and are converted from serial to parallel in unit 310. We have assumed a rather simple “serial to parallel split” transmit multiplexing, in order to make our receiver solutions general enough for all possible MIMO transmit multiplexing methods. The modulated symbol streams are split in units 310-1 to 310-M at the transmitter into M sub-streams before being transmitted across the M transmit antennas 315.

Input antennas 325 pick up the signals that are detected by detectors 330-I and processed by unit 350.

As shown in FIG. 4, the signal model at the m^(th) transmit antenna is given as follows, assuming K active Walsh codes in the system:

$\begin{matrix} {{d_{m}(i)} = {{c(i)}{\sum\limits_{k = 1}^{K}{\sum\limits_{m}{\alpha_{k}{a_{k,m}(j)}{s_{k}\left( {i - {j\; G}} \right)}}}}}} & (1) \end{matrix}$

where i, j, m and k are chip, symbol, transmit antenna and spreading code indices. The base station scrambling code is denoted by c(i). Meanwhile, α_(k) stands for the power assigned to spreading code k (same for all antennas), a_(k,m) is the information symbol sequence for spreading code k at antenna m and s_(k) is the k^(th) spreading code. Note that in this model we have implicitly assumed that the same set of Walsh codes is used across all the transmit antennas.

The transmitted signals propagate through the MIMO multipath fading channel denoted by H₀, . . . , H_(L), where each matrix H_(l) is of dimension NΔ×M, where Δ denotes the number of samples per chip.

The signal model at the receive antennas is thus given by the following equation, after stacking up the received samples across all the receive antennas for the i^(th) chip interval:

$\begin{matrix} {y_{i} = {{\sum\limits_{l = 0}^{L}{H_{l}d_{i - l}}} + {n_{i}.}}} & (2) \end{matrix}$

Note that y_(i)=[y_(i,1)^(T), . . . , y_(i,N)^(T)]^(T) is of length NΔ, and each small vector y_(i,n) includes all the temporal samples within the i^(th) chip interval. Meanwhile, L is the channel memory length, d_(i−l)=[d₁(i−l), . . . , d_(M)(i−l)]^(T) is the transmitted chip vector at time i−l, and n_(i) is the (NΔ)×1 dimensional white Gaussian noise vector with n_(i)˜N(0, σ²I). Note that σ² denotes noise variance and I is the identity matrix. Furthermore, in order to facilitate the discussion on the LMMSE receiver, we stack up a block of 2F+1 received vectors:

y _(i+F:i−F) =Hd _(i+F:i−F−L) +n _(i+F:i−F)  (3)

where 2F+1 is the length of the LMMSE equalizing filter and

$\begin{matrix} {y_{i + F:i - F} = \left\lbrack y_{i + F}^{T},\ldots,y_{i - F}^{T} \right\rbrack^{T},} & \left( (2F + 1)N\Delta \times 1 \right) \\ {n_{i + F:i - F} = \left\lbrack n_{i + F}^{T},\ldots,n_{i - F}^{T} \right\rbrack^{T},} & \left( (2F + 1)N\Delta \times 1 \right) \\ {d_{i + F:i - F - L} = \left\lbrack d_{i + F}^{T},\ldots,d_{i - F - L}^{T} \right\rbrack^{T},} & \left( (2F + L + 1)M \times 1 \right) \\ {H = \begin{bmatrix} H_{0} & \ldots & H_{L} & \; & \; \\ \; & \ddots & \; & \ddots & \; \\ \; & \; & H_{0} & \ldots & H_{L} \end{bmatrix},} & \left( (2F + 1)N\Delta \times (2F + L + 1)M \right) \end{matrix}$

where the dimensions of the matrices are given next to them. Note that to keep the notation more intuitive, we keep the subscripts at a “block” level. For instance, y_(i+F:i−F) is the vector that contains blocks y_(i+F), . . . , y_(i−F), where each block is a vector of size NΔ×1.
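The block structure of H can be made concrete with a short sketch that assembles it from the channel taps and checks it against the per-chip model of equation (2) in the noise-free case. The dimensions and random taps are illustrative assumptions, and `stacked_channel` is a hypothetical helper name.

```python
import numpy as np

def stacked_channel(taps, F):
    """Build H of eq. (3) from the taps H_0..H_L, each of shape (N*Delta, M)."""
    L = len(taps) - 1
    rows, cols = taps[0].shape
    H = np.zeros(((2 * F + 1) * rows, (2 * F + L + 1) * cols), dtype=complex)
    for p in range(2 * F + 1):                # block row p corresponds to y_{i+F-p}
        for l, H_l in enumerate(taps):        # H_l multiplies d_{i+F-p-l}
            H[p * rows:(p + 1) * rows, (p + l) * cols:(p + l + 1) * cols] = H_l
    return H

rng = np.random.default_rng(0)
N_delta, M, L, F = 2, 2, 1, 2                 # illustrative sizes
taps = [rng.standard_normal((N_delta, M)) for _ in range(L + 1)]
H = stacked_channel(taps, F)

# Noise-free check against eq. (2): block c of d_stack holds d_{i+F-c}.
d_stack = rng.standard_normal((2 * F + L + 1, M))
y_stack = (H @ d_stack.reshape(-1)).reshape(2 * F + 1, N_delta)
y_direct = np.array([sum(taps[l] @ d_stack[p + l] for l in range(L + 1))
                     for p in range(2 * F + 1)])
```

Each block row of H is the previous one shifted right by one block column, which is the block Toeplitz structure that the circulant approximation later exploits.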

The block diagram of the MIMO receiver with chip-level equalizer is shown in FIG. 4. The signals are received on antennas 405-1 through 405-N. Channel estimator 410 proceeds analogously to estimator 110 in FIG. 1 and estimates the parameters for the multiple channels. The chip-level equalizer 420 processes the raw signals in the light of input from the estimator, after which the orthogonality of the Walsh codes is partially restored. Descrambler 430 detects the desired symbols from each transmit antenna that correlate to the desired spreading code. Note that the descrambling process is also included in the code correlator 430. Lastly, unit 440 performs the functions of deinterleaving and decoding.

All of these blocks operate on the N input signals from the N receive antennas. For example, the single lines in the diagram represent a set of lines that carry the signals in parallel.

Defining an error vector of z=d_(i)−W^(H)y_(i+F:i−F) and an error covariance matrix R_(zz)=E[zz^(H)], the MIMO LMMSE chip-level equalizer W is the solution of the following problem:

$\begin{matrix} {W^{opt} = \arg\min\limits_{W}{Trace}\left( R_{zz} \right) = \arg\min\limits_{W}E\left\| d_{i} - W^{H}y_{i + F:i - F} \right\|^{2},} & (6) \end{matrix}$

whose optimal solution is given by:

W^(opt)=σ_(d)²R⁻¹H_(i:i),  (7)

where R=E[y_(i+F:i−F)y_(i+F:i−F)^(H)] is the correlation matrix of the received signal and σ_(d)² is the transmitted chip power. Meanwhile, although H is fixed for a given channel realization and is not a function of symbol index i, here we use the notation H_(i+F:i+1), H_(i:i) and H_(i−1:i−F−L) to represent the sub-matrices of H that are associated with d_(i+F:i+1), d_(i) and d_(i−1:i−F−L) in the expansion of the matrix-vector product:

$\begin{matrix} {Hd_{i + F:i - F - L}\overset{\Delta}{=}\begin{bmatrix} H_{i + F:i + 1} & H_{i:i} & H_{i - 1:i - F - L} \end{bmatrix}\begin{bmatrix} d_{i + F:i + 1} \\ d_{i} \\ d_{i - 1:i - F - L} \end{bmatrix} = H_{i + F:i + 1}d_{i + F:i + 1} + H_{i:i}d_{i} + H_{i - 1:i - F - L}d_{i - 1:i - F - L}.} & (10) \end{matrix}$

The MIMO LMMSE solution involves the inversion of the correlation matrix of received signal R, which has a complexity of O((L_(F)NΔ)³) where L_(F)=(2F+1) is the temporal length of the filter. This complexity grows rapidly as the filter length increases, rendering the direct LMMSE method impractical, especially for fast-fading channel situations that require frequent filter update. To reduce the complexity of the LMMSE algorithm, an FFT based method was proposed in the parent case to avoid the direct matrix inversion in the SISO/SIMO LMMSE equalization. We show here that this FFT-based low complexity approach can be extended to our MIMO system of interest, and name the overall algorithm MIMO LMMSE-FFT.

We start by stating that for the quasi-stationary received signal of interest, the correlation matrix assumes the following block Toeplitz structure:

$\begin{matrix} {R = \begin{bmatrix} E_{0} & \ldots & E_{L}^{H} & \; & 0 \\ \vdots & \ddots & \; & \ddots & \; \\ E_{L} & \; & \ddots & \; & E_{L}^{H} \\ \; & \ddots & \; & \ddots & \vdots \\ 0 & \; & E_{L} & \ldots & E_{0} \end{bmatrix}} & (11) \end{matrix}$

where in case of white noise, each E_(l) is a small NΔ×NΔ matrix given by

${E_{l} = {{\sigma_{d}^{2}{\sum\limits_{i = l}^{L}{H_{i}H_{i - l}^{H}}}} + {\delta_{l}\sigma_{n}^{2}I}}},\mspace{14mu} {l = 0},\ldots \mspace{11mu},L$
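The expression for the blocks E_(l) can be evaluated directly from the channel taps. The sketch below is a minimal illustration assuming white noise and random example taps; only terms for which H_(i−l) exists (i ≥ l) are summed, and the function name `correlation_blocks` is hypothetical.

```python
import numpy as np

def correlation_blocks(taps, sigma_d2, sigma_n2):
    """Return [E_0, ..., E_L] for channel taps H_0..H_L of shape (N*Delta, M)."""
    L = len(taps) - 1
    dim = taps[0].shape[0]
    blocks = []
    for l in range(L + 1):
        E_l = sigma_d2 * sum(taps[i] @ taps[i - l].conj().T for i in range(l, L + 1))
        if l == 0:
            E_l = E_l + sigma_n2 * np.eye(dim)   # delta_l keeps the noise term on lag 0 only
        blocks.append(E_l)
    return blocks

rng = np.random.default_rng(1)
taps = [rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2)) for _ in range(3)]
E = correlation_blocks(taps, sigma_d2=1.0, sigma_n2=0.1)
```

Only L+1 distinct blocks of size NΔ×NΔ need to be computed, regardless of the filter length L_(F), because all other blocks of R are either zero or Hermitian transposes of these.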

We are again interested in solving W=σ_(d) ²R⁻¹H_(i:i). Note that R is an (L_(F)NΔ)×(L_(F)NΔ) matrix. Although the direct inversion of such a large matrix is difficult, we show that with the circulant approximation and the help of FFT operations, the problem of inverting an (L_(F)NΔ)×(L_(F)NΔ) matrix reduces to the much simpler problem of inverting L_(F) small matrices of size NΔ×NΔ.

Following a procedure similar to that for the SISO case, we can approximate the block Toeplitz matrix R by a block circulant matrix S. Furthermore, in order to invert S, we define the cyclic shift matrix P of size L_(F)×L_(F):

$\begin{matrix} {P = {\begin{bmatrix} 0 & 1 \\ I & 0 \end{bmatrix}.}} & (12) \end{matrix}$

where I is the identity matrix. With this definition, it can be shown that S admits the following polynomial representation:

$\begin{matrix} {S = {{I \otimes E_{0}} + {P \otimes E_{1}} + \ldots + {P^{L_{F} - 1} \otimes E_{L_{F} - 1}}.}} & (13) \end{matrix}$

Note that ⊗ denotes the Kronecker product and E₀, . . . , $E_{L_F-1}$ form the first “block” column of matrix S. Since P is a circulant matrix, it is well known that P admits the following form of diagonalization:

P=D^(H)UD  (14)

where D is the DFT matrix and U is diagonal. Furthermore, for this special case it can be shown that $U = \mathrm{diag}(1, U_{L_F}^{-1}, \ldots, U_{L_F}^{-(L_F-1)})$ with $U_{L_F} = e^{j(2\pi/L_F)}$. Substituting (14) into (13) we get:

$\begin{matrix} \begin{matrix} {S = {\sum\limits_{i = 0}^{L_{F} - 1}{\left( {D^{H}{UD}} \right)^{i} \otimes E_{i}}}} \\ {= {\sum\limits_{i = 0}^{L_{F} - 1}{\left( {D^{H}U^{i}D} \right) \otimes E_{i}}}} \\ {= {\left( {D^{H} \otimes I} \right)\left( {\sum\limits_{i = 0}^{L_{F} - 1}{U^{i} \otimes E_{i}}} \right){\left( {D \otimes I} \right).}}} \end{matrix} & (15) \end{matrix}$

Note that we have used the identity (A⊗B)(D⊗G)=(AD)⊗(BG). In (15), D⊗I and D^(H)⊗I define the dimension-wise DFT and IDFT, respectively, meaning the DFT or IDFT is applied on each of the M diversity dimensions. On the other hand, $\sum_{i=0}^{L_F-1} U^{i} \otimes E_{i}$ is a block diagonal matrix whose diagonal blocks are the element-wise DFT of the array of matrices E₀, . . . , $E_{L_F-1}$. Or mathematically,

$\begin{matrix} {{F \equiv {\sum\limits_{i = 0}^{L_{F} - 1}{U^{i} \otimes E_{i}}}} = \begin{bmatrix} F_{0} & \; & 0 \\ \; & \ddots & \; \\ 0 & \; & F_{L_{F} - 1} \end{bmatrix}} & (16) \end{matrix}$

where F_(k) (k=0, . . . , L_(F)−1) is an NΔ×NΔ matrix defined by:

$\begin{matrix} {{F_{k} = {\sum\limits_{i = 0}^{L_{F} - 1}{U_{L_{F}}^{{- i} \cdot k}E_{i}}}},\mspace{20mu} {k = 0},\ldots \mspace{11mu},{L_{F} - 1.}} & (17) \end{matrix}$
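The diagonalization in (14) and the block-diagonal form in (15) through (17) can be checked numerically. The sketch below is only an illustration under assumed sizes (L_(F)=8, NΔ=3) and random blocks; it takes D to be the unitary DFT matrix, which is an assumption about normalization, and uses NumPy's FFT sign convention.

```python
import numpy as np

L_F, n = 8, 3                       # illustrative sizes (assumed)
rng = np.random.default_rng(1)

# Cyclic shift matrix P of (12): ones on the sub-diagonal and in the top-right corner.
P = np.roll(np.eye(L_F), 1, axis=0)

# Unitary DFT matrix D and the diagonal matrix U of (14).
D = np.fft.fft(np.eye(L_F)) / np.sqrt(L_F)
U = np.diag(np.exp(-2j * np.pi * np.arange(L_F) / L_F))
assert np.allclose(P, D.conj().T @ U @ D)

# Arbitrary first block column E_0..E_{L_F-1}, and S from the polynomial form (13).
E = rng.standard_normal((L_F, n, n)) + 1j * rng.standard_normal((L_F, n, n))
S = sum(np.kron(np.linalg.matrix_power(P, i), E[i]) for i in range(L_F))

# Element-wise FFT over the block index gives the diagonal blocks F_k.
F_blocks = np.fft.fft(E, axis=0)
F = np.zeros((L_F * n, L_F * n), dtype=complex)
for k in range(L_F):
    F[k * n:(k + 1) * n, k * n:(k + 1) * n] = F_blocks[k]

# Rebuild S as (D^H kron I) F (D kron I), the last line of (15).
DI = np.kron(D, np.eye(n))
S_rebuilt = DI.conj().T @ F @ DI
```

Under these assumptions `S_rebuilt` agrees with `S` to numerical precision, which is the property the inversion step relies on.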

Finally, the inverse of S is given by:

$\begin{matrix} {S^{- 1} = {\left( {D^{H} \otimes I} \right)F^{- 1}\left( {D \otimes I} \right)}} & (18) \end{matrix}$

with the help of the identity (N⊗M)⁻¹=N⁻¹⊗M⁻¹. Note that in this case the inversion of F reduces to the inversion of L_(F) small NΔ×NΔ matrices, since F is block diagonal. Finally, the optimal filter matrix W^(opt) is given by

$\begin{matrix} \begin{matrix} {W^{opt} = {\sigma_{d}^{2}R^{- 1}H_{i:i}}} \\ {\approx {\sigma_{d}^{2}S^{- 1}H_{i:i}}} \\ {= {{\sigma_{d}^{2}\left( {D^{H} \otimes I} \right)}{F^{- 1}\left( {D \otimes I} \right)}{H_{i:i}.}}} \end{matrix} & (19) \end{matrix}$

The overall filter design algorithm for the multi-channel system may be summarized in FIG. 7 as:

Estimate correlation matrix R from the received signal.

Convert to the block circulant matrix S by adding the two “corners”.

Take the element-wise FFT of the first “block” column of S to form F, then invert F to obtain F⁻¹.

Calculate the “dimension-wise” FFT of H_(i:i), that is (D⊗I)H_(i:i), and then F⁻¹(D⊗I)H_(i:i).

Finally, calculate the “dimension-wise” IFFT of F⁻¹(D⊗I)H_(i:i) to get the weight vector W=σ_(d) ²(D^(H)⊗I)F⁻¹(D⊗I)H_(i:i).
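The steps above, from the circulant conversion onward, can be sketched in NumPy as follows. This is an illustrative sketch rather than the implementation of FIG. 7: the function names, array layouts and the random stand-in for H_(i:i) are assumptions, the correlation estimate of the first step is replaced by blocks built from random taps, and L_(F)>2L is assumed so that the two “corners” do not overlap the band.

```python
import numpy as np

def lmmse_fft_weights(E, H_ii, L_F, sigma_d2):
    """E: (L+1, n, n) blocks E_0..E_L of R; H_ii: (L_F*n, m). Assumes L_F > 2L."""
    L, n = E.shape[0] - 1, E.shape[1]
    # First block column of S: that of R plus the wrapped-around "corner" blocks.
    C = np.zeros((L_F, n, n), dtype=complex)
    C[:L + 1] = E
    for l in range(1, L + 1):
        C[L_F - l] = E[l].conj().T
    # Element-wise FFT over the block index gives the diagonal blocks F_k.
    F = np.fft.fft(C, axis=0)
    # Dimension-wise FFT of H_ii, then one small n x n solve per frequency bin.
    Hf = np.fft.fft(H_ii.reshape(L_F, n, -1), axis=0)
    X = np.linalg.solve(F, Hf)
    # Dimension-wise IFFT; the unitary scalings of D and D^H cancel.
    W = sigma_d2 * np.fft.ifft(X, axis=0)
    return W.reshape(L_F * n, -1), C

def block_circulant(C):
    """Dense block circulant matrix with first block column C (for checking only)."""
    L_F, n, _ = C.shape
    S = np.zeros((L_F * n, L_F * n), dtype=complex)
    for r in range(L_F):
        for c in range(L_F):
            S[r * n:(r + 1) * n, c * n:(c + 1) * n] = C[(r - c) % L_F]
    return S

rng = np.random.default_rng(2)
L, n, m, L_F = 2, 4, 2, 8           # illustrative sizes (assumed)
taps = rng.standard_normal((L + 1, n, m)) + 1j * rng.standard_normal((L + 1, n, m))
E = np.array([sum(taps[i] @ taps[i - l].conj().T for i in range(l, L + 1))
              for l in range(L + 1)])
E[0] += 0.1 * np.eye(n)             # white-noise term at lag 0
H_ii = rng.standard_normal((L_F * n, m)) + 1j * rng.standard_normal((L_F * n, m))

W, C = lmmse_fft_weights(E, H_ii, L_F, sigma_d2=1.0)
W_dense = np.linalg.solve(block_circulant(C), H_ii)   # sigma_d2 = 1 here
```

In this sketch `W` matches `W_dense`, the result of solving with the dense circulant matrix S, while only L_(F) small systems are ever solved. How closely it matches the exact solution with R depends on the quality of the circulant approximation and is not checked here.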

The algorithm stated above involves one “dimension-wise” FFT and IFFT on a vector of size NΔL_(F)×1 (which is equivalent to NΔ FFT/IFFTs of length L_(F)), one “element-wise” FFT on a matrix of size NΔ×NΔ (which is equivalent to (NΔ)² FFTs of length L_(F)), and L_(F) matrix inversions of size NΔ×NΔ. The complexity of this algorithm is

${O\left( {{L_{F}\left( {N\; \Delta} \right)}^{3} + {\frac{\left( {N\; \Delta} \right)^{2} + {2\left( {N\; \Delta} \right)}}{2}L_{F}\log_{2}L_{F}}} \right)},$

compared with the much higher complexity of O((NΔL_(F))³) of a direct matrix inversion of R.
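To give a feel for the difference, the two expressions can be evaluated for sample sizes. The values L_(F)=32 and NΔ=4 below are assumed purely for illustration, and the results are order-of-magnitude operation counts that ignore constant factors.

```python
import math

def direct_cost(L_F, n):
    """Order of a direct inversion of R: (N*Delta*L_F)^3."""
    return (L_F * n) ** 3

def fft_cost(L_F, n):
    """Order of the FFT-based method: L_F small inversions plus the FFTs."""
    return L_F * n ** 3 + (n ** 2 + 2 * n) / 2 * L_F * math.log2(L_F)

L_F, n = 32, 4   # illustrative values (assumed); n stands for N*Delta
print(direct_cost(L_F, n), fft_cost(L_F, n))   # prints 2097152 3968.0
```

For these assumed sizes the direct inversion is on the order of 2×10⁶ operations against roughly 4×10³ for the FFT-based method, a ratio of about 530.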

MIMO Simulation Results

| Parameter Name | QPSK Case | 16 QAM Case |
|---|---|---|
| System | CDMA | CDMA 1X/EVDV |
| Spreading Length | 32 | 32 |
| Channel Profile | Vehicular A | Vehicular A |
| Mobile Speed | 50 km/hr | 50 km/hr |
| Filter Length | 32 | 32 |
| Number of Tx/Rx Antennas | 2/2 | 2/2 |
| Modulation Format | QPSK | 16 QAM |
| Information Data Rate | 312 kbps | 163.2 kbps |
| Turbo Code Rate | 0.6771 | 0.5313 |
| Geometry | 6 | 10 |
| Number of Walsh Codes Assigned to the user | 3 | 1 |
| Total Number of Walsh Codes in the System | 25 | 25 |

Although the invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate that other embodiments may be constructed within the spirit and scope of the following claims. 

1-36. (canceled)
 37. A method comprising: estimating a channel correlation matrix R of a received signal vector r(i); generating a filter matrix w without inverting the channel correlation matrix R; applying the generated filter matrix w to the received signal vector r(i) to find an estimated chip d(i); and outputting one of an audio signal or data from the estimated chip d(i).
 38. The method of claim 37, wherein generating the filter matrix w without inverting the channel correlation matrix R comprises: converting the channel correlation matrix R to a block circulant matrix S; and obtaining an inverse of the block circulant matrix S via Fourier transform operations.
 39. The method of claim 38, wherein converting the channel correlation matrix R to the block circulant matrix S comprises adding corner matrices to the channel correlation matrix R.
 40. The method of claim 39, wherein the corner matrices are Hermitian conjugates of one another.
 41. The method of claim 38, wherein obtaining the inverse of the block circulant matrix S via Fourier transform operations comprises executing an inverse discrete Fourier transform on only one column of the block circulant matrix S multiplied by a discrete Fourier transform of a chip-level channel impulse vector h.
 42. The method of claim 38, wherein obtaining the inverse of the block circulant matrix S via Fourier transform operations comprises executing an inverse discrete Fourier transform on only one column of the block circulant matrix S and multiplying the result by a fast Fourier transform of a chip-level channel impulse vector h.
 43. The method of claim 38, wherein obtaining the inverse of the block circulant matrix S via Fourier transform operations comprises executing an inverse discrete Fourier transform on an inverted form of a first column of the block circulant matrix S and applying the result as frequency domain filter taps of the generated filter matrix w.
 44. The method of claim 43, further comprising adding a noise floor to the block circulant matrix S as a unit matrix multiplied by a constant.
 45. The method of claim 38, wherein obtaining the inverse of the block circulant matrix S via Fourier transform operations comprises increasing filter length of the filter w while performing the Fourier transform operations in the frequency domain and truncating the increased filter length after an inverse Fourier transform operation.
 46. The method of claim 37, wherein the received signal vector r(i) is received over M antennas where M is an integer greater than one, and wherein generating the filter matrix w without inverting the channel correlation matrix R comprises: converting the channel correlation matrix R to a block circulant matrix S; cyclically shifting at least a block column of the block circulant matrix S; and obtaining an inverse of the cyclically shifted block circulant matrix S via Fourier transform operations that are executed separately on each of M dimensions of the received signal.
 47. A storage medium tangibly embodying a program of machine-readable instructions that are executable by a computer processor to perform actions directed toward processing a received signal according to actions comprising: estimating a channel correlation matrix R of a received signal vector r(i); generating a filter matrix w without inverting the channel correlation matrix R; applying the generated filter matrix w to the received signal vector r(i) to find an estimated chip d(i); and outputting one of an audio signal or data from the estimated chip d(i).
 48. The storage medium of claim 47, wherein generating the filter matrix w without inverting the channel correlation matrix R comprises: converting the channel correlation matrix R to a block circulant matrix S; and obtaining an inverse of the block circulant matrix S via Fourier transform operations.
 49. The storage medium of claim 48, wherein converting the channel correlation matrix R to the block circulant matrix S comprises adding corner matrices to the channel correlation matrix R.
 50. The storage medium of claim 49, wherein the corner matrices are Hermitian conjugates of one another.
 51. The storage medium of claim 48, wherein obtaining the inverse of the block circulant matrix S via Fourier transform operations comprises executing an inverse discrete Fourier transform on only one column of the block circulant matrix S multiplied by a discrete Fourier transform of a chip-level channel impulse vector h.
 52. The storage medium of claim 48, wherein obtaining the inverse of the block circulant matrix S via Fourier transform operations comprises executing an inverse discrete Fourier transform on only one column of the block circulant matrix S and multiplying the result by a fast Fourier transform of a chip-level channel impulse vector h.
 53. The storage medium of claim 48, wherein obtaining the inverse of the block circulant matrix S via Fourier transform operations comprises executing an inverse discrete Fourier transform on an inverted form of a first column of the block circulant matrix S and applying the result as frequency domain filter taps of the generated filter matrix w.
 54. The storage medium of claim 53, the actions further comprising adding a noise floor to the block circulant matrix S as a unit matrix multiplied by a constant.
 55. The storage medium of claim 48, wherein obtaining the inverse of the block circulant matrix S via Fourier transform operations comprises increasing filter length of the filter w while performing the Fourier transform operations in the frequency domain and truncating the increased filter length after an inverse Fourier transform operation.
 56. The storage medium of claim 47, wherein the received signal vector r(i) is received over M antennas where M is an integer greater than one, and wherein generating the filter matrix w without inverting the channel correlation matrix R comprises: converting the channel correlation matrix R to a block circulant matrix S; cyclically shifting at least a block column of the block circulant matrix S; and obtaining an inverse of the cyclically shifted block circulant matrix S via Fourier transform operations that are executed separately on each of M dimensions of the received signal.
 57. A device comprising: a channel estimator configured to estimate a channel correlation matrix R of a received signal vector r(i); an equalizer configured to generate a filter matrix w without inverting the channel correlation matrix R, and for applying the generated filter matrix w to the received signal vector r(i) to output an estimated chip d(i).
 58. The device of claim 57, wherein the channel estimator is configured to generate the filter matrix w without inverting the channel correlation matrix R by: converting the channel correlation matrix R to a block circulant matrix S; and obtaining an inverse of the block circulant matrix S via Fourier transform operations.
 59. The device of claim 58, wherein the channel estimator is configured to convert the channel correlation matrix R to the block circulant matrix S by adding corner matrices to the channel correlation matrix R.
 60. The device of claim 59, wherein the corner matrices are Hermitian conjugates of one another.
 61. The device of claim 58, wherein the channel estimator is configured to obtain the inverse of the block circulant matrix S via Fourier transform operations by executing an inverse discrete Fourier transform on only one column of the block circulant matrix S multiplied by a discrete Fourier transform of a chip-level channel impulse vector h.
 62. The device of claim 58, wherein the channel estimator is configured to obtain the inverse of the block circulant matrix S via Fourier transform operations by executing an inverse discrete Fourier transform on only one column of the block circulant matrix S and multiplying the result by a fast Fourier transform of a chip-level channel impulse vector h.
 63. The device of claim 58, wherein the channel estimator is configured to obtain the inverse of the block circulant matrix S via Fourier transform operations by executing an inverse discrete Fourier transform on an inverted form of a first column of the block circulant matrix S and applying the result as frequency domain filter taps of the generated filter matrix w.
 64. The device of claim 63, wherein the channel estimator is further configured to add a noise floor to the block circulant matrix S as a unit matrix multiplied by a constant.
 65. The device of claim 58, wherein the channel estimator is configured to obtain the inverse of the block circulant matrix S via Fourier transform operations by increasing filter length of the filter w while performing the Fourier transform operations in the frequency domain and truncating the increased filter length after an inverse Fourier transform operation.
 66. The device of claim 57, wherein the received signal vector r(i) is received over M antennas where M is an integer greater than one, and wherein the channel estimator is configured to generate the filter matrix w without inverting the channel correlation matrix R by: converting the channel correlation matrix R to a block circulant matrix S; cyclically shifting at least a block column of the block circulant matrix S; and separately executing on each of M dimensions of the received signal the Fourier transform operations so as to obtain the inverse of the cyclically shifted block circulant matrix S. 